325 research outputs found

    Combined genotype and haplotype tests for region-based association studies

    10.1186/1471-2164-14-569 BMC Genomics 141-BGME

    Review and meta-analysis of genetic polymorphisms associated with exceptional human longevity

    Background: Many factors contribute to exceptional longevity, with genetics playing a significant role. However, to date, genetic studies examining exceptional longevity have been inconclusive. This comprehensive review seeks to determine the genetic variants associated with exceptional longevity by undertaking meta-analyses. Methods: Meta-analyses of genetic polymorphisms previously associated with exceptional longevity (85+) were undertaken. For each variant, meta-analyses were performed if there were data from at least three independent studies available, including two unpublished additional cohorts. Results: Five polymorphisms, ACE rs4340, APOE ε2/3/4, FOXO3A rs2802292, KLOTHO KL-VS and IL6 rs1800795, were significantly associated with exceptional longevity, with the pooled effect sizes (odds ratios) ranging from 0.42 (APOE ε4) to 1.45 (FOXO3A, males). Conclusion: In general, the observed modest effect sizes of the significant variants suggest many genes of small influence play a role in exceptional longevity, which is consistent with results for other polygenic traits. Our results also suggest that genes related to cardiovascular health may be implicated in exceptional longevity. Future studies should examine the roles of gender and ethnicity and carefully consider study design, including the selection of appropriate controls.
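The abstract above reports pooled odds ratios but does not say which pooling model was used. As a rough illustration only, the sketch below shows one common choice, fixed-effect inverse-variance pooling on the log odds ratio scale. The function name `pool_odds_ratios` and the input numbers are hypothetical and are not taken from the paper.

```python
import math

def pool_odds_ratios(studies):
    """Fixed-effect inverse-variance pooling (an assumed model, not the paper's).

    studies: list of (odds_ratio, ci_lower, ci_upper) tuples with 95% CIs.
    Returns (pooled_or, pooled_ci_lower, pooled_ci_upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_lower, ci_upper in studies:
        log_or = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the 95% CI width.
        se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)
        weight = 1.0 / se ** 2  # more precise studies get more weight
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * pooled_se),
        math.exp(pooled_log + 1.96 * pooled_se),
    )

# Hypothetical per-study estimates for a single variant.
example = [(0.50, 0.25, 1.00), (0.45, 0.30, 0.68), (0.60, 0.35, 1.03)]
print(pool_odds_ratios(example))
```

A random-effects model would add a between-study variance term to each weight; that may be more appropriate when cohorts differ in ancestry or in how controls were selected.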

    A modified hyperplane clustering algorithm allows for efficient and accurate clustering of extremely large datasets

    Motivation: As the number of publicly available microarray experiments increases, the ability to analyze extremely large datasets across multiple experiments becomes critical. There is a requirement to develop algorithms which are fast and can cluster extremely large datasets without affecting the cluster quality. Clustering is an unsupervised exploratory technique applied to microarray data to find similar data structures or expression patterns. Because of the high input/output costs involved and large distance matrices calculated, most of the agglomerative clustering algorithms fail on large datasets (30 000+ genes/200+ arrays). In this article, we propose a new two-stage algorithm which partitions the high-dimensional space associated with microarray data using hyperplanes. The first stage is based on the Balanced Iterative Reducing and Clustering using Hierarchies algorithm with the second stage being a conventional k-means clustering technique. This algorithm has been implemented in a software tool (HPCluster) designed to cluster gene expression data. We compared the clustering results using the two-stage hyperplane algorithm with the conventional k-means algorithm from other available programs. Because the first stage traverses the data in a single scan, performance and speed increase substantially. The data reduction accomplished in the first stage of the algorithm reduces the memory requirements, allowing us to cluster 44 460 genes without failure, and significantly decreases the time to complete when compared with popular k-means programs. The software was written in C# (.NET 1.1).
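The two-stage idea described above (a single-scan reduction followed by k-means on the reduced data) can be sketched roughly as follows. This is not HPCluster and not a full BIRCH implementation: stage 1 here is a simplified threshold-based micro-clustering pass without a CF-tree or hyperplane partitioning, and the names `single_scan_reduce`, `weighted_kmeans` and the `threshold` parameter are assumptions made for illustration.

```python
import math
import random

def single_scan_reduce(points, threshold):
    """Stage 1 (simplified stand-in for BIRCH): one pass over the data.

    Each point joins the nearest micro-cluster whose centroid lies within
    `threshold`; otherwise it starts a new micro-cluster.
    Returns a list of (centroid, count) summaries.
    """
    clusters = []  # each entry: [count, coordinate_sums]
    for p in points:
        best, best_dist = None, threshold
        for c in clusters:
            centroid = [s / c[0] for s in c[1]]
            d = math.dist(p, centroid)
            if d <= best_dist:
                best, best_dist = c, d
        if best is None:
            clusters.append([1, list(p)])
        else:
            best[0] += 1
            best[1] = [s + x for s, x in zip(best[1], p)]
    return [([s / n for s in sums], n) for n, sums in clusters]

def weighted_kmeans(summaries, k, iterations=20, seed=0):
    """Stage 2: Lloyd's k-means on the summaries, weighted by point counts."""
    rng = random.Random(seed)
    centers = [list(c) for c, _ in rng.sample(summaries, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for centroid, weight in summaries:
            j = min(range(k), key=lambda i: math.dist(centroid, centers[i]))
            groups[j].append((centroid, weight))
        for j, members in enumerate(groups):
            total = sum(w for _, w in members)
            if total:  # leave a center unchanged if nothing was assigned to it
                dim = len(centers[j])
                centers[j] = [
                    sum(c[d] * w for c, w in members) / total for d in range(dim)
                ]
    return centers

# Toy expression profiles: two well-separated groups.
data = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (10, 10.1)]
summaries = single_scan_reduce(data, threshold=1.0)
print(weighted_kmeans(summaries, k=2))
```

Because stage 2 only sees the micro-cluster summaries, its cost depends on the number of summaries rather than the number of genes, which is the source of the memory and time savings the abstract describes.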

    Gene-based multiple trait analysis for exome sequencing data

    The common genetic variants identified through genome-wide association studies explain only a small proportion of the genetic risk for complex diseases. The advancement of next-generation sequencing technologies has enabled the detection of rare variants that are expected to contribute significantly to the missing heritability. Some genetic association studies provide multiple correlated traits for analysis. Multiple trait analysis has the potential to improve the power to detect pleiotropic genetic variants that influence multiple traits. We propose a gene-level association test for multiple traits that accounts for correlation among the traits. Gene- or region-level testing for association involves both common and rare variants. Statistical tests for common variants may have limited power for individual rare variants because of their low frequency and multiple testing issues. To address these concerns, we use the weighted-sum pooling method to test the joint association of multiple rare and common variants within a gene. The proposed method is applied to the Genetic Analysis Workshop 17 (GAW17) simulated mini-exome data to analyze multiple traits. Because of the nature of the GAW17 simulation model, increased power was not observed for multiple-trait analysis compared to single-trait analysis. However, multiple-trait analysis did not result in a substantial loss of power because of the testing of multiple traits. We conclude that this method would be useful for identifying pleiotropic genes.
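The abstract names weighted-sum pooling but gives no formula. The sketch below shows one well-known form of the idea, a frequency-weighted burden score per individual in the style of Madsen and Browning, so that rarer variants contribute more. The weighting scheme, the use of all individuals to estimate allele frequencies, and the function name `weighted_sum_scores` are assumptions; the paper's multi-trait test statistic is not reproduced here.

```python
import math

def weighted_sum_scores(genotypes):
    """Per-individual weighted-sum burden scores for one gene.

    genotypes: list of per-individual lists of minor-allele counts (0, 1 or 2),
    one entry per variant in the gene.
    Assumed weight for variant v: sqrt(n * q * (1 - q)), where
    q = (minor_allele_count + 1) / (2n + 2) is a smoothed allele frequency.
    """
    n = len(genotypes)
    n_variants = len(genotypes[0])
    weights = []
    for v in range(n_variants):
        minor = sum(g[v] for g in genotypes)
        q = (minor + 1) / (2 * n + 2)  # smoothing keeps q strictly in (0, 1)
        weights.append(math.sqrt(n * q * (1 - q)))
    # Dividing by the weight up-weights rare variants in the pooled score.
    return [
        sum(g[v] / weights[v] for v in range(n_variants)) for g in genotypes
    ]

# Hypothetical genotypes: 4 individuals, 2 variants in the gene.
example = [[0, 0], [1, 0], [0, 1], [0, 1]]
print(weighted_sum_scores(example))
```

The resulting scores would then be tested for association with one or more traits, for example by regression or by a permutation test.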

    SMART: Unique splitting-while-merging framework for gene clustering

    Copyright © 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet, it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset to a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Based on the performance of many metrics, all numerical results show that SMART is superior to compared existing self-splitting algorithms and traditional algorithms.
    Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms. National Institute for Health Researc

    Association tests for rare and common variants based on genotypic and phenotypic measures of similarity between individuals

    Genome-wide association studies have helped us identify thousands of common variants associated with several widespread complex diseases. However, for most traits, these variants account for only a small fraction of phenotypic variance or heritability. Next-generation sequencing technologies are being used to identify additional rare variants hypothesized to have higher effect sizes than the already identified common variants, and to contribute significantly to the fraction of heritability that is still unexplained. Several pooling strategies have been proposed to test the joint association of multiple rare variants, because testing them individually may not be optimal. Within a gene or genomic region, if there are both rare and common variants, testing their joint association may be desirable to determine their synergistic effects. We propose new methods to test the joint association of several rare and common variants with binary and quantitative traits. Our association test for quantitative traits is based on genotypic and phenotypic measures of similarity between pairs of individuals. For the binary trait or case-control samples, we recently proposed an association test based on the genotypic similarity between individuals. Here, we develop a modified version of this test for rare variants. Our tests can be used for samples taken from multiple subpopulations. The power of our test statistics for case-control samples and quantitative traits was evaluated using the GAW17 simulated data sets. Type I error rates for the proposed tests are well controlled. Our tests are able to identify some of the important causal genes in the GAW17 simulated data sets

    Genetic determinants of cortical structure (thickness, surface area and volumes) among disease free adults in the CHARGE Consortium

    Cortical thickness, surface area and volumes (MRI cortical measures) vary with age and cognitive function, and in neurological and psychiatric diseases. We examined heritability, genetic correlations and genome-wide associations of cortical measures across the whole cortex, and in 34 anatomically predefined regions. Our discovery sample comprised 22,824 individuals from 20 cohorts within the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium and the United Kingdom Biobank. Significant associations were replicated in the Enhancing Neuroimaging Genetics through Meta-analysis (ENIGMA) consortium, and their biological implications explored using bioinformatic annotation and pathway analyses. We identified genetic heterogeneity between cortical measures and brain regions, and 160 genome-wide significant associations pointing to wnt/β-catenin, TGF-β and sonic hedgehog pathways. There was enrichment for genes involved in anthropometric traits, hindbrain development, vascular and neurodegenerative disease and psychiatric conditions. These data are a rich resource for studies of the biological mechanisms behind cortical development and aging

    Reordering Hierarchical Tree Based on Bilateral Symmetric Distance

    BACKGROUND: In microarray data analysis, hierarchical clustering (HC) is often used to group samples or genes according to their gene expression profiles to study their associations. In a typical HC, nested clustering structures can be quickly identified in a tree. The relationship between objects is lost, however, because clusters rather than individual objects are compared. This results in a tree that is hard to interpret. METHODOLOGY/PRINCIPAL FINDINGS: This study proposes an ordering method, HC-SYM, which minimizes bilateral symmetric distance of two adjacent clusters in a tree so that similar objects in the clusters are located in the cluster boundaries. The performance of HC-SYM was evaluated by both supervised and unsupervised approaches and compared favourably with other ordering methods. CONCLUSIONS/SIGNIFICANCE: The intuitive relationship between objects and flexibility of the HC-SYM method can be very helpful in the exploratory analysis of not only microarray data but also similar high-dimensional data

    A meta-analysis of genome-wide association studies of growth differentiation Factor-15 concentration in blood

    Blood levels of growth differentiation factor-15 (GDF-15), also known as macrophage inhibitory cytokine-1 (MIC-1), have been associated with various pathological processes and diseases, including cardiovascular disease and cancer. Prior studies suggest genetic factors play a role in regulating blood MIC-1/GDF-15 concentration. In the current study, we conducted the largest genome-wide association study (GWAS) to date using a sample of ∼5,400 community-based Caucasian participants, to determine the genetic variants associated with MIC-1/GDF-15 blood concentration. Conditional and joint (COJO), gene-based association, and gene-set enrichment analyses were also carried out to identify novel loci, genes, and pathways. Consistent with prior results, a locus on chromosome 19, which includes nine single nucleotide polymorphisms (SNPs) (top SNP, rs888663, p = 1.690 × 10⁻³⁵), was significantly associated with blood MIC-1/GDF-15 concentration, and explained 21.47% of its variance. COJO analysis showed evidence for two independent signals within this locus. Gene-based analysis confirmed the chromosome 19 locus association and, in addition, identified a putative locus on chromosome 1. Gene-set enrichment analyses showed that the “COPI-mediated anterograde transport” gene-set was associated with MIC-1/GDF-15 blood concentration with marginal significance after FDR correction (p = 0.067). In conclusion, a locus on chromosome 19 was associated with MIC-1/GDF-15 blood concentration with genome-wide significance, with evidence for a new locus (chromosome 1). Future studies using independent cohorts are needed to confirm the observed associations, especially for the chromosome 1 locus, and to further investigate and identify the causal SNPs that contribute to MIC-1/GDF-15 levels.